Combining Local and Global KNN With Cotraining
Authors
Abstract
Semi-supervised learning is a machine learning paradigm in which the induced hypothesis is improved by taking advantage of unlabeled data. It is particularly useful when labeled data are scarce. Cotraining is a widely adopted semi-supervised approach that assumes the availability of two views of the training data, a restrictive assumption for most real-world tasks. In this paper, we propose a one-view Cotraining approach that combines two different k-Nearest Neighbors (KNN) strategies, referred to as global and local KNN. In global KNN, the nearest neighbors used to classify a new instance are the training examples that include this instance among their own k nearest neighbors. In local KNN, on the other hand, the neighborhood considered when classifying a new instance is computed with the traditional KNN approach. We carried out experiments showing that a combination of these strategies significantly improves classification accuracy in Cotraining, particularly when only a single view of the training data is available. We also introduce an optimized algorithm to cope with the time complexity of computing the global KNN, which makes it feasible to tackle real classification problems.
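The difference between the two neighborhood rules can be made concrete with a short sketch. The Python snippet below is illustrative only and is not the authors' optimized algorithm; the function names, the Euclidean metric, the majority vote, and the fallback to local KNN when no training example claims the query are our own assumptions. It contrasts the standard local KNN rule with a global, reverse-nearest-neighbor style rule in which a training example votes only if the query would fall among that example's own k nearest neighbors.

```python
# Minimal sketch of local vs. global KNN classification (assumed details noted above).
import numpy as np
from collections import Counter

def local_knn_predict(X_train, y_train, x, k=2):
    """Local (standard) KNN: vote among the k training points closest to x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    neighbors = np.argsort(dists)[:k]
    return Counter(y_train[neighbors]).most_common(1)[0][0]

def global_knn_predict(X_train, y_train, x, k=2):
    """Global KNN: vote among training points that would count x among
    their own k nearest neighbors (a reverse-nearest-neighbor style rule)."""
    voters = []
    for i, xi in enumerate(X_train):
        d_train = np.linalg.norm(X_train - xi, axis=1)
        d_train[i] = np.inf                      # exclude xi itself
        d_x = np.linalg.norm(xi - x)
        # xi votes if x is closer to xi than xi's k-th nearest training point
        if d_x < np.sort(d_train)[k - 1]:
            voters.append(i)
    if not voters:                               # assumed fallback: use local KNN
        return local_knn_predict(X_train, y_train, x, k)
    return Counter(y_train[voters]).most_common(1)[0][0]

# Toy usage: two well-separated clusters, query near the second cluster
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
              [1.0, 1.0], [1.1, 0.9], [0.9, 1.1]])
y = np.array([0, 0, 0, 1, 1, 1])
q = np.array([0.95, 1.0])
print(local_knn_predict(X, y, q), global_knn_predict(X, y, q))  # both print 1
```

Note that the naive global rule above scans the whole training set for every query, which is why the paper's optimized algorithm for computing the global KNN matters in practice.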
Similar articles
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods such as entity-based and graph-based models track how nouns and noun phrases change roles across consecutive sentences within a short span of text. They also have limitations in global coheren...
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the history of previous link connections and combining it with available information. Local link prediction approaches based on node structure are fast but not accurate enough. On the other hand, global link predicti...
Combining Multiple Features and Classifiers for Increasing Accuracy for Devanagari Script
In this paper we combine statistical, structural, global transformation and moment features to form a hybrid feature vector, and we combine classifiers to achieve high accuracy for Devanagari script. To reduce misclassification and increase classifier accuracy, SVM and KNN are combined. The dataset used for the experiments was created by us.
Geometric k-nearest neighbor estimation of entropy and mutual information
Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for a large sample size. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induc...
Comparison of Four Methods for Premature Ventricular Contractions and Normal Beats Clustering
The learning capacity and classification ability of four classification methods for clustering normal beats and premature ventricular contractions were compared: neural networks (NN), the k-nearest neighbour rule (KNN), discriminant analysis (DA) and fuzzy logic (FL). Twenty-six morphology feature parameters, which include information on amplitude, area, specific interval durations and measu...